Combining Substructures to Uncover The Relational Web
Abstract
I describe an approach for automatically converting websites into relational form. The approach relies on the existence of multiple types of substructure within a collection of pages from a website. Each substructure has a corresponding expert that generates a set of simple hints for that collection. Each hint describes the alignment of some tokens within relations. An optimization algorithm then finds the relational representation of the website that maximizes the likelihood of observing the hints. The contributions of the thesis will be a new approach for combining heterogeneous substructures in document collections, an implemented system that will make massive amounts of web data available to applications that use only structured data, and new search techniques in probabilistic constraint satisfaction.
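The hint-combination idea can be illustrated with a toy sketch. It is not the thesis's algorithm: the hint format (a pair of tokens, a same-column flag, and a reliability probability), the function names, and the exhaustive search are all assumptions made for illustration. The search stands in for the optimization algorithm, which the abstract does not specify.

```python
import itertools
import math

# Hypothetical hint format (an assumption, not taken from the thesis):
# (token_a, token_b, same_column, p), where p is the probability that
# the expert's claim about the two tokens is correct.

def log_likelihood(assignment, hints):
    """Log-likelihood of observing the hints given a token-to-column assignment."""
    total = 0.0
    for a, b, same_column, p in hints:
        agrees = (assignment[a] == assignment[b]) == same_column
        total += math.log(p if agrees else 1.0 - p)
    return total

def best_assignment(tokens, n_columns, hints):
    """Exhaustively search column assignments; feasible only for tiny inputs."""
    best, best_score = None, -math.inf
    for columns in itertools.product(range(n_columns), repeat=len(tokens)):
        assignment = dict(zip(tokens, columns))
        score = log_likelihood(assignment, hints)
        if score > best_score:
            best, best_score = assignment, score
    return best, best_score

# Toy data: two names and two prices taken from two hypothetical pages.
tokens = ["Alice", "Bob", "$10", "$20"]
hints = [
    ("Alice", "Bob", True, 0.9),   # e.g. a layout expert: same position on each page
    ("$10", "$20", True, 0.8),     # e.g. a format expert: both look like prices
    ("Alice", "$10", False, 0.9),  # different token types on the same page
    ("Bob", "$20", False, 0.7),
]

assignment, score = best_assignment(tokens, n_columns=2, hints=hints)
print(assignment, round(score, 3))
```

Each expert contributes only weak, local evidence, and the assignment that best explains all the hints together is taken as the relational form. Exhaustive search grows exponentially with the number of tokens, so a real system would need a far more efficient search strategy.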
Similar resources
Tightly Integrating Relational Learning and Multiple-Instance Regression for Real-Valued Drug Activity Prediction
We present a new machine learning approach for 3D-QSAR, the task of predicting binding affinities of molecules to target proteins based on 3D structure. Our approach predicts binding affinity by using regression on substructures discovered by relational learning. We make two contributions to the state-of-the-art. First, we use multiple-instance (MI) regression, which represents a molecule as a ...
Yaanii: Effective Keyword Search over Semantic Dataset
Nowadays data is disseminated in a number of different sources, from database systems to the Web, from a traditional structured organization (relational) to a semi-structured (XML), up to the unstructured ones (text in Web documents). Although the availability of data is constantly increasing, one principal difficulty users have to face is to find and retrieve the information they are looking for....
Semi-Automatic Creation of Enterprise Mashups Using Semantic Descriptions
Mashups are the next generation of web applications. A mashup is a lightweight web application that is created by combining information or capabilities from more than one existing resource to deliver a new and integrated experience to the user. Mashups introduce a new class of integration techniques in enterprises for implementing situational applications (i.e. applications that come together to s...
Web-based Information for Medical Tourism: Case Study of AriaMedTour Medical Tourism Company, Iran
Objective: As one of the well-known countries for medical tourism, Iran has the potential for growth in this industry and requires information and advertisements in online media and websites. This study aims to investigate the effectiveness of the content produced by the website of AriaMedTour Medical Tourism Company in informing tourists. Methods: This is an applied study that adopted an indu...
Internet Marketing Strategies
The use of the Internet has increased remarkably in recent years. Companies employ the World Wide Web (WWW) to gather, disseminate and interchange information with actual and potential customers, so Internet technology appears to serve as a strategic tool that affects a firm's strategies and practices, such as Porter's competitive strategies. Many research findings confirm an...
Publication date: 2004